Estimating 3D tilt from local image cues in natural scenes
Abstract
Estimating three-dimensional (3D) surface orientation (slant and tilt) is an important first step toward estimating 3D shape. Here, we examine how three local image cues from the same location (disparity gradient, luminance gradient, and dominant texture orientation) should be combined to estimate 3D tilt in natural scenes. We collected a database of natural stereoscopic images with precisely co-registered range images that provide the ground-truth distance at each pixel location. We then analyzed the relationship between ground-truth tilt and image cue values. Our analysis is free of assumptions about the joint probability distributions and yields the Bayes optimal estimates of tilt, given the cue values. Rich results emerge: (a) typical tilt estimates are only moderately accurate and strongly influenced by the cardinal bias in the prior probability distribution; (b) when cue values are similar, or when slant is greater than 40°, estimates are substantially more accurate; (c) when luminance and texture cues agree, they often veto the disparity cue, and when they disagree, they have little effect; and (d) simplifying assumptions common in the cue combination literature are often justified for estimating tilt in natural scenes. The fact that tilt estimates are typically not very accurate is consistent with subjective impressions from viewing small patches of natural scenes. The fact that estimates are substantially more accurate for a subset of image locations is also consistent with subjective impressions and with the hypothesis that perceived surface orientation, at more global scales, is achieved by interpolation or extrapolation from estimates at key locations.
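To make the idea of an assumption-free estimate concrete, the following is a minimal sketch of one way such an analysis could be set up: quantize each cue's tilt value into bins, then store the circular mean of the ground-truth tilt for every combination of cue bins. This is an illustration, not the authors' procedure. The function names (`quantize`, `fit_tilt_lookup`, `estimate_tilt`), the bin count, and the choice of the circular mean as the optimal estimate (which presumes a particular cost function not stated in the abstract) are all assumptions.

```python
import numpy as np


def quantize(angle_deg, n_bins):
    """Map angles in degrees to integer bin indices in [0, n_bins)."""
    width = 360.0 / n_bins
    idx = np.floor(np.mod(angle_deg, 360.0) / width).astype(int)
    return idx % n_bins  # guards against a wrapped value landing on n_bins


def fit_tilt_lookup(true_tilt_deg, cue_tilts_deg, n_bins=8):
    """Build a lookup table of conditional circular means.

    true_tilt_deg : (n_samples,) ground-truth tilt in degrees
    cue_tilts_deg : (n_samples, n_cues) cue tilt values in degrees
    Returns (lookup, counts), each with shape (n_bins,) * n_cues.
    Cells with no samples are NaN in `lookup`.
    """
    cue_tilts_deg = np.asarray(cue_tilts_deg, dtype=float)
    n_cues = cue_tilts_deg.shape[1]
    shape = (n_bins,) * n_cues

    # One flat cell index per sample, combining all cue bins.
    bins = quantize(cue_tilts_deg, n_bins)
    flat = np.ravel_multi_index(tuple(bins.T), shape)

    # Accumulate unit vectors of the true tilt within each cell.
    theta = np.deg2rad(np.asarray(true_tilt_deg, dtype=float))
    n_cells = n_bins ** n_cues
    sin_sum = np.bincount(flat, weights=np.sin(theta), minlength=n_cells)
    cos_sum = np.bincount(flat, weights=np.cos(theta), minlength=n_cells)
    counts = np.bincount(flat, minlength=n_cells)

    # Direction of the summed vector is the circular mean.
    lookup = np.mod(np.rad2deg(np.arctan2(sin_sum, cos_sum)), 360.0)
    lookup[counts == 0] = np.nan
    return lookup.reshape(shape), counts.reshape(shape)


def estimate_tilt(lookup, cue_tilts_deg):
    """Read tilt estimates for new cue values out of the lookup table."""
    n_bins = lookup.shape[0]
    bins = quantize(np.asarray(cue_tilts_deg, dtype=float), n_bins)
    return lookup[tuple(bins.T)]
```

With three cues, `cue_tilts_deg` would have three columns (for example disparity, luminance, and texture tilt), and the table would have `n_bins ** 3` cells. Because the table is filled directly from samples, no parametric form for the joint distribution is needed, but sparsely populated cells give unreliable estimates.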
Related resources
The Statistics of Natural Scenes and the Inference of 3D Shape Thesis Proposal
In spite of a great potential benefit to computer vision and image understanding in general, little is known about the statistics of natural 3D scenes. Current algorithms that infer 3D shape from monocular image cues are typically based on simplifying assumptions inspired by physical models of image formation that often fail to hold in natural scenes. My thesis will address both of these issues...
The lawful imprecision of human surface tilt estimation in natural scenes
Estimating local surface orientation (slant and tilt) is fundamental to recovering the three-dimensional structure of the environment. It is unknown how well humans perform this task in natural scenes. Here, with a database of natural stereo-images having ground-truth surface orientation at each pixel, we find dramatic differences in human tilt estimation with natural and artificial stimuli. Est...
Scaling Laws in Natural Scenes and the Inference of 3D Shape
This paper explores the statistical relationship between natural images and their underlying range (depth) images. We look at how this relationship changes over scale, and how this information can be used to enhance low resolution range data using a full resolution intensity image. Based on our findings, we propose an extension to an existing technique known as shape recipes [3], and the succes...
Local Occlusion Detection under Deformations Using Topological Invariants
Occlusions provide critical cues about the 3D structure of man-made and natural scenes. We present a mathematical framework and algorithm to detect and localize occlusions in image sequences of scenes that include deforming objects. Our occlusion detector works under far weaker assumptions than other detectors. We prove that occlusions in deforming scenes occur when certain well-defined local t...
Learning the Statistics of People in Images and Video
This paper addresses the problems of modeling the appearance of humans and distinguishing human appearance from the appearance of general scenes. We seek a model of appearance and motion that is generic in that it accounts for the ways in which people’s appearance varies and, at the same time, is specific enough to be useful for tracking people in natural scenes. Given a 3D model of the person pr...